Quantifying the Relationship between Hit Count Estimates and Wikipedia Article Traffic
Authors
Abstract
This paper analyzes the relationship between search engine hit counts and Wikipedia article views by evaluating the cross-correlation between them. We observe the hit count estimates of three popular search engines over a month and compare them with the Wikipedia page views. We record the strongest cross-correlations together with their delays, measured in days. We present the results both as graphs and as quantitative data for the different search engines. We also investigate predictive trends between the hit counts and Wikipedia article traffic.
Keywords: hit count estimations; search engines; Wikipedia article traffic; cross-correlation; positive delay; negative delay; prediction of Web hosting trend
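The lagged cross-correlation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `lagged_cross_correlation` and `strongest_lag`, the use of the Pearson coefficient, and the sign convention (a positive lag means the second series follows the first by that many days) are all assumptions made for illustration.

```python
import numpy as np


def lagged_cross_correlation(x, y, max_lag):
    """Return {lag: Pearson r between x[t] and y[t + lag]} for |lag| <= max_lag.

    Assumed convention: lag > 0 means y follows x by `lag` days,
    lag < 0 means y precedes x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D series of equal length")

    result = {}
    for lag in range(-max_lag, max_lag + 1):
        # Keep only the overlapping part of the two series at this shift.
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        # Skip lags where the correlation is undefined.
        if len(a) < 2 or a.std() == 0 or b.std() == 0:
            continue
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result


def strongest_lag(x, y, max_lag):
    """Return (lag, r) for the lag with the largest absolute correlation."""
    corr = lagged_cross_correlation(x, y, max_lag)
    best = max(corr, key=lambda k: abs(corr[k]))
    return best, corr[best]
```

With roughly a month of daily observations, `x` could hold one search engine's daily hit count estimates for a term and `y` the daily page views of the matching Wikipedia article; `strongest_lag(x, y, 7)` would then report the delay, in days, at which the two series are most strongly correlated within a one-week window. The one-week window is likewise an illustrative choice.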
Related Resources
HIT Approaches to Entity Linking at TAC 2011
This paper describes the system of HIT at the 2011 Text Analysis Conference (TAC) Knowledge Base Population (KBP) track English Entity Linking task. Based on structured and unstructured information extracted from Wikipedia, this system predicts the most probable entity that a query mention might refer to. A similarity score is assigned to the candidate entity by computing the relatedness be...
Web citations in patents: Evidence of technological impact?
Patents sometimes cite webpages either as general background to the problem being addressed or to identify prior publications that limit the scope of the patent granted. Counts of the number of patents citing an organization’s website may therefore provide an indicator of its technological capacity or relevance. This article introduces methods to extract URL citations from patents and evaluates...
Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language
BACKGROUND Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. OBJECTIVE This paper quantifies the production and consumption of Wikipedia's medical content along 4 dimensions. First, we measured the amount of medical content in both...
The Substantial Interdependence of Wikipedia and Google: A Case Study on the Relationship Between Peer Production Communities and Information Technologies
While Wikipedia is a subject of great interest in the computing literature, very little work has considered Wikipedia’s important relationships with other information technologies like search engines. In this paper, we report the results of two deception studies whose goal was to better understand the critical relationship between Wikipedia and Google. These studies silently removed Wikipedia c...
The performance of a LRU cache under dynamic catalog traffic
We propose a simple traffic model featuring a dynamic catalog to construct a theoretical estimate of the hit ratio for an LRU cache offered such a traffic regime. We validate the accuracy of our theoretical estimates by computing the empirical hit ratio for real request sequences coming from traces of the Orange network.